ABSTRACT

We develop a new measure of reliability for the mean behavior of a process by calculating the probability
that the cumulative sample mean will ever deviate from its long-term mean, or from its true mean, by more than
a fixed amount over a period of time. This measure can be used as an alternative to estimating system
performance using confidence intervals. We derive the tradeoffs between four critical parameters for this
measure: the underlying variance of the data, the starting sample size of a procedure, and the precision
of and confidence in the result.


1  INTRODUCTION 

We  propose  a  new  metric  for  evaluating  system  performance  that  is  stronger  than  the  traditional  confidence 
interval.  We  present  a  measure  of  reliability  for  the  cumulative  mean  behavior  of  a  process,  by  calculating 
the  probability  that  the  sample  mean  of  a  time  series  stays  within  some  fixed  distance  from  its  long-term 
mean  after  a  given  initial  sample  size.  The  long-term  mean  could  be  the  true  mean,  or  the  sample  mean 
after  a  long  period  of  time.  The  underlying  time  series  is  assumed  to  meet  the  conditions  for  a  functional 
central  limit  theorem  (FCLT),  an  assumption  used  in  many  simulation  output  analysis  methods. 

We  calculate  this  measure  by  structuring  simulation  output  data  as  a  standardized  time  series  (Schruben 
1983),  which  under  the  FCLT  assumption  is  a  Brownian  bridge  in  the  limit.  Manipulating  the  standardization 
allows  us  to  evaluate  the  difference  between  the  cumulative  mean  and  the  long-term  mean  as  a  function  of 
a  Brownian  bridge.  We  derive  a  lower  bound  for  the  probability  that  this  difference  between  the  means  is 
always  less  than  a  specified  amount  after  a  specified  initial  sample  size  by  calculating  boundary  crossing 
properties  of  Brownian  bridges.  This  measure  provides  more  information  than  a  traditional  confidence 
interval,  which  only  evaluates  the  cumulative  mean  once. 
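
As an illustration of the standardization described above, the following Python sketch builds a standardized time series from raw output and recovers the difference between the cumulative mean and the long-term mean from it. It assumes the usual form T(i/m) = i (Ybar_m - Ybar_i) / (sigma sqrt(m)) from Schruben (1983), a known sigma, and i.i.d. normal data. The function name, parameters, and test data are illustrative and do not come from this paper.

```python
import math
import random
from itertools import accumulate


def standardized_time_series(y, sigma):
    """Return (sts, cum_means) for the data y.

    sts[i-1] = i * (Ybar_m - Ybar_i) / (sigma * sqrt(m)) for i = 1, ..., m,
    and cum_means[i-1] = Ybar_i, the cumulative sample mean after i points.
    """
    m = len(y)
    cum_means = [s / i for i, s in enumerate(accumulate(y), start=1)]
    final_mean = cum_means[-1]
    scale = sigma * math.sqrt(m)
    sts = [i * (final_mean - cum_means[i - 1]) / scale for i in range(1, m + 1)]
    return sts, cum_means


# Assumed example data: i.i.d. normal output with known sigma.
rng = random.Random(0)
sigma = 2.0
y = [rng.gauss(5.0, sigma) for _ in range(1000)]
sts, cum_means = standardized_time_series(y, sigma)

# Inverting the standardization gives the difference between the means:
# Ybar_i - Ybar_m = -sigma * sqrt(m) * T(i/m) / i
i = 100
diff_from_sts = -sigma * math.sqrt(len(y)) * sts[i - 1] / i
print(diff_from_sts, cum_means[i - 1] - cum_means[-1])
```

The series ends at zero by construction, which is the pinned endpoint of a Brownian bridge. Bounding the difference between the means therefore amounts to bounding the bridge by a boundary that depends on i.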

In addition to the implications for confidence interval procedures, this measure is useful in experimental
settings. Examples include evaluating a production system over a year, where cumulative average performance
each month converges to average performance over the year. We may be interested in knowing the
likelihood that the cumulative performance early in the year will deviate from the end-of-year results.


2  A  MEASURE  OF  RELIABILITY  FOR  MEAN  BEHAVIOR 

We calculate the probability that the cumulative sample mean of simulation output data $Y$ after $k$ initial
samples, which is $\bar{Y}_i$, $i = k, \ldots$, ever deviates from its long-term mean $\bar{Y}_m$, or its true mean $\mu$, by more than
$\delta$. Let $\sigma^2$ be the variance of the $Y_i$ data points, $\delta$ be the allowable deviation, and $m$ be the total sample
size. We write this measure, standardized by time, as PB (probability in bounds):

Our result gives a lower bound for PB.
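
For intuition about the quantity being bounded, PB can also be estimated by brute-force simulation. The sketch below is a minimal Monte Carlo estimate of the probability that the cumulative mean stays within delta of the long-term mean for every i = k, ..., m. It assumes i.i.d. normal data with known sigma, and all function names and parameter values are hypothetical. It is an empirical check, not the analytical lower bound derived in this paper.

```python
import random


def stays_in_bounds(y, k, delta):
    """True if |Ybar_i - Ybar_m| <= delta for every i = k, ..., m."""
    m = len(y)
    final_mean = sum(y) / m
    running = 0.0
    for i, value in enumerate(y, start=1):
        running += value
        if i >= k and abs(running / i - final_mean) > delta:
            return False
    return True


def estimate_pb(k, m, delta, sigma, reps=2000, seed=1):
    """Fraction of simulated paths whose cumulative mean stays in bounds."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Assumption: i.i.d. N(0, sigma^2) output; the mean value is irrelevant
        # here because only differences between sample means are used.
        y = [rng.gauss(0.0, sigma) for _ in range(m)]
        hits += stays_in_bounds(y, k, delta)
    return hits / reps


print(estimate_pb(k=20, m=200, delta=0.3, sigma=1.0))
```

Because the seed is fixed, calls that differ only in delta reuse the same simulated paths, so the estimate cannot decrease as delta grows.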

Theorem 2.1. Let $\Phi$ be the cumulative distribution function of the standard normal distribution. Under
the FCLT assumption, the probability that the cumulative sample mean $\bar{Y}_i$ stays within distance $\delta$ from the
long-term mean $\bar{Y}_m$ over the range $i = k, \ldots, m$ has a lower bound
The lower bound for the probability that the sample mean $\bar{Y}_i$ stays within distance $\delta$ from $\mu$ over the range
$i = k, \ldots, \infty$ as $m \to \infty$ is


3  RESULTS  AND  CONCLUSIONS

Figure 1 shows PB in the limit as $m \to \infty$ for a variety of combinations of $k$ and $\delta/\sigma$ using (1). Because
$\delta$ and $\sigma$ only appear in (1) as $\delta/\sigma$, we condense them to one term on the y-axis. On the x-axis, as $k$
increases we see that PB increases, because the sample mean is more likely to deviate from the true mean
at smaller sample sizes. On the y-axis, when $\delta$ is high relative to $\sigma$, PB is higher because the bounds
are loose relative to the variance of the procedure. However, when $\sigma$ is large, this ratio decreases and the
probability of staying within the bounds decreases. This shows the importance of having $\delta/\sigma$ relatively
large, for any value of $k$, or alternatively, if a small $\delta$ is required, of using a larger $k$.
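
The qualitative trends described above can be reproduced with a small simulation. The sketch below estimates, over a grid of k and delta/sigma, the probability that the sample mean stays within delta of the true mean for all i >= k. It assumes i.i.d. standard normal data (mu = 0, sigma = 1, so delta/sigma equals delta) and truncates the infinite range at a finite horizon, which is an approximation. The function names and grid values are hypothetical, and the output is a Monte Carlo estimate rather than the bound in (1).

```python
import random


def suffix_max_deviation(path_len, rng):
    """For one i.i.d. N(0, 1) path, return d with d[k-1] = max over i >= k of |Ybar_i|."""
    total = 0.0
    devs = []
    for i in range(1, path_len + 1):
        total += rng.gauss(0.0, 1.0)
        devs.append(abs(total / i))
    # Backward pass: turn pointwise deviations into suffix maxima.
    for i in range(path_len - 2, -1, -1):
        devs[i] = max(devs[i], devs[i + 1])
    return devs


def pb_grid(ks, ratios, horizon=1000, reps=300, seed=7):
    """Estimate P(|Ybar_i - mu| <= delta for all k <= i <= horizon) on a grid."""
    rng = random.Random(seed)
    # Common random numbers: every grid cell is evaluated on the same paths.
    paths = [suffix_max_deviation(horizon, rng) for _ in range(reps)]
    return {
        (k, r): sum(p[k - 1] <= r for p in paths) / reps
        for k in ks
        for r in ratios
    }


grid = pb_grid(ks=[10, 50, 200], ratios=[0.1, 0.3, 0.5])
for (k, r), pb in sorted(grid.items()):
    print(f"k={k:4d}  delta/sigma={r:.1f}  estimated PB={pb:.3f}")
```

Because all cells share the same simulated paths, the estimates are monotone in both k and delta/sigma by construction, matching the direction of the trends read from Figure 1.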